A 3D-1D Substitution Matrix for Protein
Abstract
In protein fold recognition, a probe amino acid sequence is compared to a library of representative folds of known structure to identify a structural homolog. In cases where the probe and its homolog have clear sequence similarity, traditional residue substitution matrices have been used to predict the structural similarity. For cases where the probe is sequentially distant from its homolog, we have developed a (7 × 3 × 2 × 7 × 3) 3D-1D substitution matrix (called H3P2), calculated from a database of 119 structural pairs. Members of each pair share a similar fold but have sequence identity of less than 30%. Each probe sequence position is defined by one of 7 residue classes and 3 secondary structure classes. Each homologous fold position is defined by one of 7 residue classes, 3 secondary structure classes, and 2 burial classes. Thus the matrix is 5-dimensional and contains 7 × 3 × 2 × 7 × 3 = 882 elements, or 3D-1D scores. The first step in assigning a probe sequence to its homologous fold is the prediction of the 3-state (helix, strand, coil) secondary structure of the probe; here we use the PHD program. Then a dynamic programming algorithm uses the H3P2 matrix to align the probe sequence with structures in a representative fold library. To test the effectiveness of the H3P2 matrix, a challenging, fold-class-diverse, cross-validated benchmark assessment is used to compare the H3P2 matrix to the GONNET, PAM250, and BLOSUM62 matrices and to a secondary-structure-only substitution matrix. For distantly related sequences, the H3P2 matrix detects more homologous structures at higher reliabilities than do these other substitution matrices, based on sensitivity versus specificity plots (SENS-SPEC plots). The added efficacy of the H3P2 matrix arises from its information on the statistical preferences for various sequence-structure environment combinations from very distantly related proteins. It introduces the predicted secondary structure information from a sequence into fold recognition in a statistical way that normalizes the inherent correlations between residue type, secondary structure, and solvent accessibility.
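As a rough illustration of the matrix layout and the alignment step described above, the following Python sketch indexes a 5-dimensional score table of shape 7 × 3 × 2 × 7 × 3 and uses it in a simple dynamic programming alignment. Only the dimensions come from the abstract. The function names, the order of the five axes, the use of a global (Needleman-Wunsch style) alignment, the linear gap penalty, and the toy score values are all assumptions made for illustration, not the authors' implementation.

```python
# Dimensions from the abstract. Fold position = (residue class, secondary
# structure class, burial class); probe position = (residue class, predicted
# secondary structure class). The axis order is an assumption.
N_RES, N_SS, N_BURIAL = 7, 3, 2
SHAPE = (N_RES, N_SS, N_BURIAL, N_RES, N_SS)


def n_elements():
    """Number of 3D-1D scores in the table (product of the five axis sizes)."""
    total = 1
    for size in SHAPE:
        total *= size
    return total


def flat_index(fold_res, fold_ss, fold_burial, probe_res, probe_ss):
    """Map a 5-D coordinate to a position in a flat, row-major score list."""
    coords = (fold_res, fold_ss, fold_burial, probe_res, probe_ss)
    idx = 0
    for value, size in zip(coords, SHAPE):
        if not 0 <= value < size:
            raise IndexError("class index out of range")
        idx = idx * size + value
    return idx


def align_score(fold, probe, scores, gap=-1.0):
    """Best global alignment score of a fold against a probe.

    fold:   list of (residue, secondary structure, burial) class tuples
    probe:  list of (residue, predicted secondary structure) class tuples
    scores: flat list of n_elements() values (hypothetical, not H3P2 data)
    gap:    linear gap penalty (an assumed value)
    """
    rows, cols = len(fold) + 1, len(probe) + 1
    dp = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dp[i][0] = i * gap
    for j in range(1, cols):
        dp[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = scores[flat_index(*fold[i - 1], *probe[j - 1])]
            dp[i][j] = max(
                dp[i - 1][j - 1] + pair,  # align fold position i with probe position j
                dp[i - 1][j] + gap,       # gap in the probe
                dp[i][j - 1] + gap,       # gap in the fold
            )
    return dp[-1][-1]


# Toy usage: a table of zeros with a single favourable cell.
toy_scores = [0.0] * n_elements()
toy_scores[flat_index(0, 0, 0, 0, 0)] = 2.0
toy_fold = [(0, 0, 0), (0, 0, 0)]
toy_probe = [(0, 0), (0, 0)]
toy_result = align_score(toy_fold, toy_probe, toy_scores)
```

In practice the scores would be filled from the statistics of the structural pairs, and the probe's secondary structure classes would come from a predictor such as PHD. A flat list is used here only to keep the sketch free of dependencies.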